We identify 45,265 non-owner-occupied residential parcels in the city of Milwaukee. This is our universe of “landlord-owned properties.” After applying some string standardization, there are 27,212 unique owner names and 23,064 unique owner mailing addresses.
We combine names and addresses into connected networks as follows:
Step 1: take an owner name and identify all the mailing addresses it uses. There is a maximum of 1 mailing address per parcel.
Step 2: find all the owner names used at any of the addresses identified in step 1.
Step 3: find all the addresses used by any of the owner names identified in step 2.
Step 4: repeat steps 2 and 3 until no new owner names or addresses are found.
The result of this process is the MPROP Owner Group. Currently, we name the MPROP Owner Group after the most commonly used owner name in each group.
Because this approach exhausts all possible connections before moving on to the next owner name, it yields identical results regardless of the order in which owner names are processed. It is not possible for an owner name or address to be assigned to multiple groups.
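The steps above amount to finding the connected components of a graph whose nodes are owner names and mailing addresses. Here is a minimal sketch in base R, assuming a data frame with one row per parcel and hypothetical columns `name` and `address`. The helper `assign_owner_groups` is illustrative only and is not the routine used to build the MPROP Owner Groups.

```r
# Hypothetical helper: start every parcel in its own group, then repeatedly
# pass the smallest label across shared names and shared addresses.
assign_owner_groups <- function(parcels) {
  label <- seq_len(nrow(parcels))
  repeat {
    previous <- label
    # parcels sharing an owner name take the smallest label among them
    label <- ave(label, parcels$name, FUN = min)
    # parcels sharing a mailing address do the same
    label <- ave(label, parcels$address, FUN = min)
    # stop once a full pass finds no new matches
    if (identical(label, previous)) break
  }
  # name each group after its most frequent owner name
  # (ties go to the first name in alphabetical order)
  group_name <- tapply(parcels$name, label, function(x) {
    paste(names(which.max(table(x))), "Group")
  })
  parcels$group <- as.vector(group_name[as.character(label)])
  parcels
}

# toy data, invented for illustration
toy <- data.frame(
  name    = c("A LLC", "A LLC", "B LLC", "C LLC", "D LLC"),
  address = c("1 MAIN ST", "2 OAK ST", "2 OAK ST", "9 ELM ST", "9 ELM ST")
)
grouped <- assign_owner_groups(toy)
```

In this toy example the first three parcels land in the "A LLC Group" (B LLC is pulled in through the shared address) and the last two in the "C LLC Group". Reversing the row order gives the same grouping, which is the order-independence property described above.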
Any MPROP Owner Group can be represented as a graph of owner names and addresses.
Here are the 10 properties associated with the “KENNETH R SIDELLO Group.”
mprop %>%
  filter(mprop_group == "KENNETH R SIDELLO Group") %>%
  select(mprop_name, mprop_address)

# A tibble: 10 × 2
   mprop_name            mprop_address
   <chr>                 <chr>
 1 VINE STREET LOFTS LLC 4864 S 10TH ST, MILWAUKEE WI
 2 WARREN CHATEAU        4864 S 10TH ST, MILWAUKEE WI
 3 LA FAMILIA 737 LLC    4864 S 10TH ST, MILWAUKEE WI
 4 KENNETH SIDELLO       7200 WHIPPOORWILL CT, WIND LAKE WI
 5 STEVE SIDELLO         7200 WHIPPOORWILL CT, WIND LAKE WI
 6 KENNETH R SIDELLO     7200 WHIPPOORWILL CT, WIND LAKE WI
 7 KENNETH R SIDELLO     7200 WHIPPOORWILL CT, WIND LAKE WI
 8 KENNETH R SIDELLO     4864 S 10TH ST, MILWAUKEE WI
 9 KENNETH R SIDELLO     7200 WHIPPOORWILL CT, WIND LAKE WI
10 STEVE SIDELLO         7200 WHIPPOORWILL CT, WIND LAKE WI
And here is a visual representation of the graph. The size of each dot indicates how commonly that node appeared in the network. In this case, the owner KENNETH R SIDELLO is connected to two addresses. The other 5 owner names are all also connected to one of those two addresses. We can see that 1 of the other names is just Ken Sidello’s name without the middle initial, while another, STEVE SIDELLO, might be a relative.
Here is another kind of network, that of the VB ONE LLC Group. As the graph shows, most of the nodes in this large network are either various ways of writing the same address or misspellings of “VB ONE LLC.” The handful of other LLC names include VB EIGHT LLC and VB SIX LLC, both of which are connected to VB ONE LLC in multiple ways.
mprop_network_graph(mprop, "VB ONE LLC Group")
This method of matching parcels based on owner names and addresses yields 20,440 distinct owner groups, versus 27,212 unique individual owner names.
The table below shows the count of MPROP owner groups and the parcels they own by overall size.
Corporate registrations can also be used to match parcel ownership, but making meaningful connections is more difficult because only registered agent names and addresses are available. Many landlords use generic LLC registration services, making much of this data useless.
To select only useful cases, I subset the corporate registration data from the Wisconsin Department of Financial Institutions (WDFI) as follows:
I remove any corporations that are not currently registered. I keep delinquent registrations, which are common, but drop terminated ones.
I remove any registrations with the address of an agent manually determined to be non-useful.
I remove any registrations sharing an address with an agent whose name includes any of a set of keywords like LAWYER, TAX, or ACCOUNTING.
I identify those registrations whose corporate name matches an MPROP owner name value, after string standardization. I keep all of these matched corporations and also any other registrations which:
share the address of a matched registration AND
share the registered agent of a matched registration
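The subsetting rules above can be sketched in base R as follows. Every column name, status value, and record here is hypothetical, and the sketch assumes that "sharing the address and agent of a matched registration" means sharing both with the same matched registration.

```r
# Toy registrations; all names and values are invented for illustration.
wdfi_toy <- data.frame(
  corp_name     = c("ALPHA LLC", "BETA LLC", "GAMMA LLC", "DELTA LLC", "OMEGA LLC"),
  status        = c("current", "delinquent", "terminated", "current", "current"),
  agent_name    = c("J DOE", "J DOE", "J DOE", "ACME TAX SERVICE", "R ROE"),
  agent_address = c("1 MAIN ST", "1 MAIN ST", "1 MAIN ST", "5 PINE ST", "5 PINE ST")
)
mprop_owner_names  <- c("ALPHA LLC")      # standardized MPROP owner names
excluded_addresses <- character(0)        # addresses flagged by hand as non-useful
keywords           <- "LAWYER|TAX|ACCOUNTING"

# 1. keep current and delinquent registrations, drop terminated ones
keep <- wdfi_toy[wdfi_toy$status != "terminated", ]

# 2. drop registrations at manually excluded agent addresses
keep <- keep[!keep$agent_address %in% excluded_addresses, ]

# 3. drop every registration at an address used by a keyword-matching agent
generic <- unique(keep$agent_address[grepl(keywords, keep$agent_name)])
keep <- keep[!keep$agent_address %in% generic, ]

# 4. flag direct MPROP matches, then keep them plus any registration that
#    shares both agent name and agent address with a direct match
keep$corp_mprop_match <- keep$corp_name %in% mprop_owner_names
key <- paste(keep$agent_name, keep$agent_address, sep = " | ")
keep <- keep[key %in% key[keep$corp_mprop_match], ]
```

In the toy data, GAMMA LLC is dropped as terminated, DELTA LLC and OMEGA LLC are dropped because they sit at an address used by a "TAX" agent, and BETA LLC survives only because it shares an agent name and address with the matched ALPHA LLC.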
After creating this universe of WDFI corporate registrations, I apply a matching routine similar in most respects to the MPROP matching routine described above. Instead of matching parcels by owner names and addresses, I match corporate registrations by registered agent names and addresses.
Another variation is that I provide a list of addresses which cannot be used to make a match even though the agent name may still be used for matching purposes. This happens commonly when an address is only made distinct by a missing suite number or when multiple meaningfully distinct registered agents use the same UPS mailing facility.
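One way to sketch this exclusion, under assumed column names: before grouping, replace each excluded address with a placeholder unique to its row. The address then cannot link two registrations, while the agent name on that row still can. The list `no_match_addresses` and the toy records are hypothetical.

```r
# Toy registrations, invented for illustration.
registrations <- data.frame(
  corp_name     = c("ALPHA LLC", "BETA LLC", "GAMMA LLC", "DELTA LLC"),
  agent_name    = c("J DOE", "JOHN DOE", "R ROE", "R ROE"),
  agent_address = c("100 STATE ST", "100 STATE ST", "100 STATE ST", "7 LAKE DR")
)
no_match_addresses <- c("100 STATE ST")   # e.g. a shared mailing facility

# give each blocked address a row-specific placeholder so it matches nothing
blocked <- registrations$agent_address %in% no_match_addresses
match_address <- registrations$agent_address
match_address[blocked] <- paste0("BLOCKED-", which(blocked))

# pass the smallest label across shared agent names and (unblocked)
# addresses until a full pass changes nothing
label <- seq_len(nrow(registrations))
repeat {
  previous <- label
  label <- ave(label, registrations$agent_name, FUN = min)
  label <- ave(label, match_address, FUN = min)
  if (identical(label, previous)) break
}
registrations$agent_group_id <- label
```

Without the exclusion, all four toy registrations would collapse into one group through "100 STATE ST". With it, only GAMMA LLC and DELTA LLC are grouped, via the shared agent name R ROE.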
# only keep networks connecting multiple MPROP-matched corporations
wdfi.connected <- wdfi |>
  filter(corp_mprop_match == TRUE) |>
  group_by(agent_group) |>
  filter(n_distinct(corp_name_clean) > 1) |>
  ungroup()

# summary statistics for agent groups that match multiple MPROP owners
wdfi.agent.group.totals <- wdfi.connected |>
  group_by(agent_group) |>
  summarise(corporations = n(),
            agents = n_distinct(agent_name),
            addresses = n_distinct(agent_address))
4,239 currently registered corporations matched an MPROP owner name value, and an additional 4,910 corporate registrations did not appear in MPROP but did share a registered agent name and an agent address with a true MPROP match. This latter group can still be used to make connections between matched MPROP owners.
In total, 2,207 matched MPROP owners were connected into one of 574 WDFI Agent Groups.
WDFI Agent Groups can be visualized using network graphs in the same style as MPROP Owner Groups.
Here is the graph of the SAM STAIR Group agent network, which includes 19 separate corporate filings, 4 agent names, and 1 address. All the agent names are variations on the same name, “Sam Stair.”
We match MPROP ownership groups with WDFI agent groups by merging the MPROP parcel data with the WDFI corporate registration data where the parcel owner is the same as the WDFI registrant.
We then summarize the data to show each unique connection between an MPROP Owner Group and a WDFI Agent Group, keeping only those connections where the WDFI agent network links parcels that the MPROP Owner Groups had not already connected. Here is the result.
mprop.groups.joined.by.wdfi.group <- mprop %>%
  group_by(mprop_name, mprop_group) %>%
  summarise() %>%
  ungroup() %>%
  inner_join(wdfi.connected, by = c("mprop_name" = "corp_name_clean")) %>%
  group_by(agent_group) %>%
  # filter results for instances where multiple owner_groups are matched
  filter(n_distinct(mprop_group) > 1) %>%
  group_by(agent_group, mprop_group) %>%
  summarise() %>%
  ungroup()

reactable(mprop.groups.joined.by.wdfi.group, searchable = TRUE, showPageSizeOptions = TRUE,
          defaultPageSize = 5, pageSizeOptions = seq(5, 20, 5))
Most often, a single agent group connects multiple owner groups, but the reverse also occurs: one owner group can connect multiple agent groups. Some networks therefore consist of several agent groups and several owner groups. To identify every such network, I use a custom recursive function that follows connections until no new agent groups are found.
# connect each node to its network group by assigning a numeric ID
f <- function(x, d) {
  m <- merge(d, d[d$agent_group %in% x, "mprop_group"])
  if (length(unique(m$agent_group)) == length(x)) {
    return(m %>%
             group_by_all() %>%
             summarise(.groups = "drop") %>%
             mutate(group_contents = paste(sort(c(unique(mprop_group), unique(agent_group))),
                                           collapse = "; ")))
  } else {
    f(unique(m$agent_group), d)
  }
}

nodes.to.ids <- map_df(unique(mprop.groups.joined.by.wdfi.group$agent_group), f,
                       d = mprop.groups.joined.by.wdfi.group, .progress = TRUE) %>%
  arrange(group_contents) %>%
  group_by(group_contents) %>%
  mutate(rownum = row_number()) %>%
  ungroup() %>%
  mutate(wdfi_mprop_group_id = cumsum(rownum == 1)) %>%
  select(mprop_group, agent_group, wdfi_mprop_group_id)

reactable(nodes.to.ids, searchable = TRUE, showPageSizeOptions = TRUE,
          defaultPageSize = 5, pageSizeOptions = seq(5, 20, 5))
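The recursive function above appears to be solving what graph theory calls a connected components problem on a two-column table of agent groups and owner groups. As a point of comparison, here is a hedged sketch of the same idea using a small union-find in base R. The toy `pairs` table and the column `network_id` are hypothetical, and this is not the function used to produce the results on this page.

```r
# Toy table of agent group / owner group connections (invented).
pairs <- data.frame(
  agent_group = c("AG1", "AG1", "AG2", "AG3", "AG3"),
  mprop_group = c("OG1", "OG2", "OG2", "OG3", "OG4")
)

# prefix node names so an agent group and an owner group that happen to
# share a name are still treated as separate nodes
a <- paste0("agent:", pairs$agent_group)
o <- paste0("owner:", pairs$mprop_group)
nodes <- unique(c(a, o))
ia <- match(a, nodes)
io <- match(o, nodes)

# union-find: parent[i] points toward the representative node of i's network
parent <- seq_along(nodes)
find_root <- function(i) {
  while (parent[i] != i) i <- parent[i]
  i
}
for (k in seq_along(ia)) {
  ra <- find_root(ia[k])
  ro <- find_root(io[k])
  # merge the two networks by pointing the larger root at the smaller one
  if (ra != ro) parent[max(ra, ro)] <- min(ra, ro)
}

# number the networks 1, 2, ... in order of first appearance
roots <- sapply(ia, find_root)
pairs$network_id <- match(roots, unique(roots))
```

Here AG1 and AG2 end up in network 1 because both touch OG2, while AG3 forms network 2 with OG3 and OG4. Each pair is visited once, so no recursion is needed.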
The combined MPROP & WDFI networks are then merged with the original MPROP Owner Groups to create a final_group name.
combined.wdfi.mprop.groups <- nodes.to.ids %>%
  group_by(mprop_group, wdfi_mprop_group_id) %>%
  summarise(.groups = "drop") %>%
  right_join(mprop) %>%
  group_by(wdfi_mprop_group_id) %>%
  # rename group with most frequent mprop_name value
  mutate(final_group = paste(names(which.max(table(mprop_name))), "Group")) %>%
  ungroup() %>%
  # preserve old mprop_group name if the group is still the original mprop-derived group
  mutate(final_group = if_else(is.na(wdfi_mprop_group_id), mprop_group, final_group)) %>%
  left_join(wdfi.connected, by = c("mprop_name" = "corp_name_clean")) %>%
  mutate(final_group_source = if_else(is.na(wdfi_mprop_group_id), "mprop", "combo")) %>%
  select(TAXKEY, mprop_name, mprop_address, agent_name, agent_address,
         final_group, final_group_source)

reactable(combined.wdfi.mprop.groups, searchable = TRUE, showPageSizeOptions = TRUE,
          defaultPageSize = 5, pageSizeOptions = seq(5, 20, 5))
The graphs of these final property groups can be visualized in the same way as the MPROP Owner Groups. Shapes still indicate whether the node is an address or name. Now, for groups where the WDFI agent data was incorporated, color indicates if the name or address corresponds to an agent or a parcel owner.
This example shows the graph for the MKE ESTATES LLC Group, which is created out of a combination of MPROP and WDFI connections.
The MPROP connections are shown in red. The network connects two MPROP parcel owners, 1905308 WATER LLC and MKE ESTATES LLC. The addresses these two owners use in MPROP are typed inconsistently, so the MPROP matching routine doesn’t catch them. Future improvements to text matching may catch cases like this one. Still, it is a useful illustration of how the WDFI network matching works:
The registered agent for 1905308 WATER LLC is MICHAEL PIAS.
The address for MICHAEL PIAS is 19295 ALTA VISTA CR BROOKFIELD.
A registered agent named MICHAEL PATRICK PIAS also uses the address 19295 ALTA VISTA CR BROOKFIELD.
MICHAEL PATRICK PIAS is the registered agent for MKE ESTATES LLC.
final_network_graph <- function(data, final_group_name, seed = 42, layout_algo = "lgl") {
  parcels <- data %>%
    filter(final_group == final_group_name)

  # if it is an original mprop group, make that network
  # otherwise, make a network graph showing agent connections
  if (unique(parcels$final_group_source) == "mprop") {
    data %>%
      rename(mprop_group = final_group) %>%
      mprop_network_graph(mprop_group_name = final_group_name)
  } else {
    parcels <- data %>%
      filter(final_group == final_group_name) %>%
      select(TAXKEY, mprop_name, mprop_address, agent_name, agent_address) %>%
      # agent names & addresses can be identical to mprop names/addresses
      # but they are separate nodes, so add distinguishing suffix
      mutate(agent_name = if_else(!is.na(agent_name),
                                  paste(agent_name, "a", sep = "-"),
                                  NA_character_),
             agent_address = if_else(!is.na(agent_address),
                                     paste(agent_address, "a", sep = "-"),
                                     NA_character_))

    # build the connections between mprop names/addresses and agent names/addresses
    connections <- bind_rows(
      parcels %>%
        pivot_longer(cols = -TAXKEY) %>%
        separate(name, into = c("type", "name_addr"), sep = "_") %>%
        filter(!is.na(value)) %>%
        pivot_wider(names_from = name_addr, values_from = value) %>%
        select(from = name, to = address),
      # these are the connections between agent names and mprop names
      parcels %>%
        filter(!is.na(agent_name)) %>%
        select(from = mprop_name, to = agent_name)
    )

    node.frequency <- parcels %>%
      pivot_longer(cols = everything(), names_to = "type", values_to = "name") %>%
      group_by(name) %>%
      summarise(node_frequency = n())

    graph <- as_tbl_graph(connections) %>%
      # add attributes
      mutate(node_class = case_when(
        name %in% parcels$mprop_name ~ "owner name",
        name %in% parcels$mprop_address ~ "mprop address",
        name %in% parcels$agent_name ~ "agent name",
        name %in% parcels$agent_address ~ "agent address"
      )) %>%
      inner_join(node.frequency)

    set.seed(seed)
    ggraph(graph, layout = layout_algo) +
      geom_edge_link() +
      geom_node_point(aes(color = node_class, shape = node_class, size = node_frequency)) +
      geom_node_label(aes(label = str_wrap(str_remove(name, "-a\\b"), 20)),
                      size = if_else(nrow(node.frequency) > 75, 1, 2),
                      repel = TRUE, segment.color = "gray75", min.segment.length = 0.25,
                      segment.size = 0.25,
                      fill = alpha(c("white"), 0.5), max.overlaps = 50) +
      scale_colour_manual(name = "Node type",
                          labels = c("agent address", "agent name", "mprop address", "mprop name"),
                          values = c("blue", "blue", "red", "red")) +
      scale_shape_manual(name = "Node type",
                         labels = c("agent address", "agent name", "mprop address", "mprop name"),
                         values = c(19, 17, 19, 17)) +
      guides(size = "none") +
      labs(title = paste("Graph of the", final_group_name, "ownership network"),
           subtitle = paste("This network includes", nrow(parcels), "parcels,",
                            n_distinct(parcels$mprop_name), "owner names",
                            n_distinct(parcels$mprop_address), "owner addresses,",
                            n_distinct(parcels$agent_name), "agent names, &",
                            n_distinct(parcels$agent_address), "agent addresses.")) +
      theme_graph() +
      theme(legend.position = "top",
            legend.title = element_blank())
  }
}

final_network_graph(combined.wdfi.mprop.groups, "MKE ESTATES LLC Group")
Here is an example of a more complex network, successfully connected by registered agent name. According to LinkedIn, Amir Erez is the CEO of Fair Deal Home Buyers (a cash homebuying company) and the managing partner of A&E Investment & Management.
As the network graph below shows, Amir Erez in his capacity as a registered agent links two MPROP networks covering these companies.
final_network_graph(combined.wdfi.mprop.groups, "CREAM CITY LOFTS LLC Group")
Finally, here is an example of a large network held together by several more tenuous connections. I suspect some of these don’t represent genuine connections, so further investigation is in order.
I’m worried I might actually need to learn some graph theory…
final_network_graph(combined.wdfi.mprop.groups, "STEWART G FRIEND Group", seed =4)